The Open Provenance Model: An Overview

Authors

  • Luc Moreau
  • Juliana Freire
  • Joe Futrelle
  • Robert E. McGrath
  • James D. Myers
  • Patrick R. Paulson
Abstract

Provenance is well understood in the context of art or digital libraries, where it refers respectively to the documented history of an art object or to the documentation of processes in a digital object’s life cycle. Interest in provenance is also growing in the “e-science community” [12], since provenance is perceived as a crucial component of workflow systems that can help scientists ensure the reproducibility of their scientific analyses and processes [2,4]. Against this background, the International Provenance and Annotation Workshop (IPAW’06), held on May 3-5, 2006 in Chicago, involved some 50 participants interested in the issues of data provenance, process documentation, data derivation, and data annotation [7]. During a session on provenance standardization, a consensus began to emerge that the provenance research community needed to understand better the capabilities of the different systems, the representations they used for provenance, their similarities, their differences, and the rationale that motivated their designs. Hence the first Provenance Challenge [1] was born, set up from the outset to be informative rather than competitive: it was intended to provide a forum for the community to understand the capabilities of different provenance systems and the expressiveness of their provenance representations. Participants simulated or ran a Functional Magnetic Resonance Imaging workflow, from which they implemented and executed a pre-identified set of “provenance queries”. Sixteen teams responded to the challenge and reported their experience in a journal special issue [9]. The first Provenance Challenge was followed by the second Provenance Challenge [1], which aimed to establish interoperability of systems by exchanging provenance information.
During discussions, the thirteen teams that responded to the second challenge found that there was substantial agreement on a core representation of provenance. As a result, following a workshop in August 2007 in Salt Lake City, a data model was crafted by the authors and released as the Open Provenance Model (OPM v1.00) [8]. On June 19th, 2008, some twenty participants attended the first OPM workshop, held after IPAW’08 [3], to discuss the OPM specification. Minutes of the workshop and recommendations [5] were published and led to the current version (v1.01) of the Open Provenance Model [10].

Similar resources

Provenance-based Access Control Models

Full text

SPADE: Support for Provenance Auditing in Distributed Environments

SPADE is an open source software infrastructure for data provenance collection and management. The underlying data model used throughout the system is graph-based, consisting of vertices and directed edges that are modeled after the node and relationship types described in the Open Provenance Model. The system has been designed to decouple the collection, storage, and querying of provenance met...

Full text

Provenance management in Swift

The Swift parallel scripting language allows for the specification, execution and analysis of large-scale computations in parallel and distributed environments. It incorporates a data model for recording and querying provenance information. In this article we describe these capabilities and evaluate interoperability with other systems through the use of the Open Provenance Model. We describe Sw...

Full text

Mapping the NRC Dataflow Model to the Open Provenance Model

The Open Provenance Model (OPM) has recently been proposed as an exchange framework for workflow provenance information. In this paper we show how the NRC data model for workflow repositories can be mapped to the OPM. Our mapping includes such features as complex data flow in an execution of a workflow; different workflows in the repository that call each other; and the tracking of subvalues of...

Full text

GProM - A Swiss Army Knife for Your Provenance Needs

We present an overview of GProM, a generic provenance middleware for relational databases. The system supports diverse provenance and annotation management tasks through query instrumentation, i.e., compiling a declarative frontend language with provenance-specific features into the query language of a backend database system. In addition to introducing GProM, we also discuss research contribut...

Full text

Publication date: 2008